Percentiles of sums of heavy-tailed random variables: 
Beyond the single-loss approximation. 
Abstract



A perturbative approach is used to derive approximations 
of arbitrary order to estimate high percentiles of sums of 
positive independent random variables that exhibit heavy 
tails. Closed-form expressions for the successive approxi- 
mations are obtained both when the number of terms in 
the sum is deterministic and when it is random. The ze- 
roth order approximation is the percentile of the maximum 
term in the sum. Higher orders in the perturbative se- 
ries involve the right-truncated moments of the individual 
random variables that appear in the sum. These censored 
moments are always finite. As a result, and in contrast 
to previous approximations proposed in the literature, the 
perturbative series has the same form regardless of whether 
these random variables have a finite mean or not. For high
percentiles, and especially for heavier tails, the quality of
the estimate improves as more terms are included in the
series, up to a certain order. Beyond that order the convergence
of the series deteriorates. Nevertheless, the approximations
obtained by truncating the perturbative series at
intermediate orders are remarkably accurate for a variety 
of distributions in a wide range of parameters. 

Keywords: Subexponential distributions, Heavy tails, 
Percentile estimation, Aggregate loss distribution, Cen- 
sored moments, Value at Risk 



1 Introduction 

In this article we derive accurate closed-form approximations
for high percentiles of sums of positive independent
identically distributed random variables (iidrv's) with
heavy tails. This is an important computational task in
applications such as wireless communications [1], workload
processes [2, 3] and in the quantification of risk in insurance
and finance [4, 5]. A particularly important application in
finance is the quantification of operational risk [6, 7, 8, 9].

There are several numerical procedures to estimate percentiles
of sums of iidrv's: the Panjer
recursion algorithm, a method based on the Fast Fourier 
Transform, and Monte Carlo simulation jTOJ |H1 [H] ■ These 
numerical techniques are efficient and yield accurate esti- 
mates of high percentiles of sums of random variables provided
that these are not too heavy-tailed: their computational
cost increases as the tails of the probability distribution
become heavier, and they eventually become impracticable.
When Monte Carlo simulation is used, this difficulty can be
addressed using variance reduction techniques [12, 13].

In this work we take a different approach and derive closed-form
approximations for high percentiles of the aggregate
distribution based on a perturbative expansion. The zeroth
order term in the perturbative expansion is similar to
the single-loss approximation [14], which assumes that the
sum is dominated by the maximum. This dominance in the
sum by the maximum is a property of subexponential distributions,
a subclass of heavy-tailed distributions [15, 16].
These types of distributions appear in important areas of
application, such as insurance and finance [1], hydrology
[17], queueing models [18, 19], the characterization of the
Internet [20], and other areas of application [21].

The first order perturbative approximation, which includes
the zeroth order term plus a first order correction, is similar
to approximations that can be derived from the asymptotic
tail behavior of sums of subexponential variables
[22, 23, 24, 25, 26, 27, 28, 29, 30]. Assuming that the mean
of the individual random variables in the sum is finite, these
approximations are all similar to the mean-corrected single-loss
formula, which was proposed by [31] using heuristic arguments.
In this article we provide an explicit procedure
to derive higher order terms in the perturbative expansion,
which provides a more accurate approximation to high percentiles
of sums of positive iidrv's.

The perturbative series introduced in this article differs in 
important aspects from previous approximations proposed 
in the literature. In particular, the terms in the pertur- 
bative series are expressed as a function of the moments 
of the right-truncated distribution for the individual rv's 
in the sum. These censored moments exist even when the 
moments of the original distribution (without truncation) 
diverge. Consequently, the same expression is valid for both 
the finite and infinite mean cases. For high percentiles, the 
perturbative expansion provides a sequence of approxima- 
tions that, up to certain order, has increasing quality as 
more terms are included. Beyond that order the conver- 
gence of the series deteriorates. 

The article is organized as follows: section 2 presents the
derivation of a perturbative expansion for the percentile of
sums of two random variables. This expansion is then applied
to the estimation of high percentiles of sums of N
independent random variables in section 3. The key idea is
to treat separately the maximum and the remaining terms
in the sum. Explicit formulas are derived when N, the number
of terms in the sum, is either deterministic or stochastic.
Section 4 reviews the approximations for high percentiles of
sums of iidrv's that have been proposed in the literature.

The accuracy of the perturbative series is illustrated in section 5
by comparing with exact results or with Monte Carlo
estimates, if closed-form expressions are not available. Finally,
section 6 summarizes the contributions of this work
and discusses the perspectives for further research.



2 Perturbative expansion for the per- 
centiles of the sum of two random vari- 
ables 

In this section we derive a perturbative expansion of the 
percentile of a sum of two random variables. The zeroth 
order term in the perturbative series is the percentile of one 
of the variables in the sum. Higher order terms involve the
moments of the second variable, conditioned on the first one
having a fixed value. In the following section, these general
expressions are applied to the particular case of sums of N
random variables by identifying the first random variable
with the maximum in the sum and the second one with the
remainder.
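As a sanity check of this kind of expansion, the following minimal sketch uses a toy case that is not taken from the article: X exponential with unit rate and Y uniform on (0, 1), independent, with the sum written as Z = X + eps*Y. For independent variables the first coefficients reduce to Q_1 = E[Y] and Q_2 = -Var[Y] f_X'(Q_0)/f_X(Q_0), which equals Var[Y] for the unit exponential; this is a special case of the general formulas derived in this section. The exact percentile is available in closed form here (valid when it is at least eps), so the truncation error can be inspected directly. The function names are illustrative only.

```python
import math

def exact_percentile(alpha, eps):
    """Exact alpha-percentile of Z = X + eps*Y for X ~ Exp(1), Y ~ U(0, 1).

    Follows from F_Z(z) = 1 - exp(-z) * E[exp(eps*Y)] for z >= eps.
    """
    return -math.log(1.0 - alpha) + math.log(math.expm1(eps) / eps)

def perturbative_percentile(alpha, eps, order):
    """Truncated series Q_0 + Q_1*eps + Q_2*eps**2/2 (order <= 2)."""
    q0 = -math.log(1.0 - alpha)   # percentile of X alone
    q1 = 0.5                      # E[Y]
    q2 = 1.0 / 12.0               # Var[Y] * (-f_X'/f_X), and -f_X'/f_X = 1 for Exp(1)
    coeffs = [q0, q1, q2][: order + 1]
    return sum(c * eps**k / math.factorial(k) for k, c in enumerate(coeffs))

alpha, eps = 0.99, 1.0
exact = exact_percentile(alpha, eps)
errors = [abs(perturbative_percentile(alpha, eps, K) - exact) for K in range(3)]
```

In this toy case the absolute error shrinks at each order, because the exact correction log((e^eps - 1)/eps) has the Taylor expansion eps/2 + eps^2/24 + ...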



Let $X$ and $Y$ be two rv's whose joint distribution function
is $F_{X,Y}(x,y)$ (density $f_{X,Y}(x,y)$). Consider the random
variable

$$ Z = X + \epsilon Y, \tag{1} $$

whose probability distribution is $F_Z(z)$ (density $f_Z(z)$). It
is not possible to express this distribution in a closed form
that does not involve a convolution, except in special cases
[32, 33]. Let $Q_0 = F_X^{-1}(\alpha)$ and $Q = F_Z^{-1}(\alpha)$ be the
$\alpha$-percentiles of $X$ and $Z$, respectively. The percentile of $Z$ at
probability level $\alpha$ can be formally represented by a power
series in $\epsilon$



$$ Q = Q_0 + \delta Q = Q_0 + \sum_{k=1}^{\infty} \frac{Q_k}{k!}\,\epsilon^k. \tag{2} $$



The approximation of order K to Q is the result of keeping 
only the first K + 1 terms in the series 



$$ Q^{(K)} = Q_0 + \sum_{k=1}^{K} \frac{Q_k}{k!}\,\epsilon^k. \tag{3} $$



Explicit expressions for the zeroth and first coefficients in
(2) have been derived in [33], in the context of credit risk.
Also in this context, [35] give an explicit expression for the
derivatives $\partial^n F_Z(z)/\partial\epsilon^n$, which are used in the perturbative
expansion in $\epsilon$ for $F_Z(z)$, the CDF of the sum. Our goal in
this section is to derive a general expression for the terms in
a perturbative expansion of the percentile (i.e. the inverse
function $F_Z^{-1}(\alpha)$).

The starting point of the derivation is the identity

$$ 0 = F_Z(Q) - F_X(Q_0) = \int_{-\infty}^{\infty} dy \int_{Q_0}^{Q - \epsilon y} dx\, f_{X,Y}(x,y). \tag{4} $$



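For a sufficiently smooth $f(x)$, one can define the operators

$$ e^{t\partial_x} = \sum_{k=0}^{\infty} \frac{t^k}{k!}\,\partial_x^k, \qquad \partial_x^{-1} f(x) = \int^{x} du\, f(u), \tag{5} $$

where $\partial_x \equiv d/dx$, and their composition

$$ \left( e^{t\partial_x} - 1 \right) \partial_x^{-1} f(x) = \int_{x}^{x+t} f(u)\,du. \tag{6} $$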



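In terms of these operators

$$ \int_{Q_0}^{Q - \epsilon y} dx\, f_{X,Y}(x,y) = \int_{Q_0}^{Q_0 + \delta Q - \epsilon y} dx\, f_{X,Y}(x,y) = \left. \left( e^{(\delta Q - \epsilon y)\partial_x} - 1 \right) \partial_x^{-1} f_{X,Y}(x,y) \right|_{x=Q_0}. \tag{7} $$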



Using this result, (4) can be expressed as

$$ 0 = \int_{-\infty}^{\infty} dy \left. \left( e^{(\delta Q - \epsilon y)\partial_x} - 1 \right) \partial_x^{-1} f_{X,Y}(x,y) \right|_{x=Q_0}. \tag{8} $$

Expanding the exponential operator in a formal Taylor
power series and using the definition of the complete Bell
polynomials (Appendix A, eq. (80)), this expression becomes

$$ 0 = \int_{-\infty}^{\infty} dy \sum_{k=1}^{\infty} \frac{\epsilon^k}{k!} \left. B_k\big( (Q_1 - y)\partial_x, Q_2\partial_x, \ldots, Q_k\partial_x \big)\, \partial_x^{-1} f_{X,Y}(x,y) \right|_{x=Q_0}. \tag{9} $$

Since this equality holds for all $\epsilon$, each coefficient in the sum
must be zero separately. This yields the system of equations

$$ 0 = \int_{-\infty}^{\infty} dy \left. B_k\big( (Q_1 - y)\partial_x, Q_2\partial_x, \ldots, Q_k\partial_x \big)\, \partial_x^{-1} f_{X,Y}(x,y) \right|_{x=Q_0}, \quad \text{for } k \ge 1. \tag{10} $$

Explicit expressions for $Q_k$ can be derived in terms of $C_k$, a
centered version of the Bell polynomials (Appendix A)

$$
\begin{aligned}
Q_1 &= E[Y \mid X = Q_0] \\
Q_k &= -\frac{1}{f_X(Q_0)} \Bigg[ \sum_{i=1}^{k} \binom{k}{i} C_{k-i}(Q_2\partial_x, \ldots, Q_{k-i}\partial_x)\, \partial_x^{i-1} \{ f_X(x) \mathcal{M}_i(x) \} \\
&\qquad\qquad + \sum_{i=2}^{k-2} \binom{k-1}{i-1} Q_i\, C_{k-i}(Q_2\partial_x, \ldots, Q_{k-i}\partial_x)\, f_X(x) \Bigg]_{x=Q_0}, \quad \text{for } k \ge 2,
\end{aligned} \tag{11}
$$

where

$$ \mathcal{M}_i(x) = E[(Q_1 - Y)^i \mid X = x] = \sum_{j=0}^{i} \binom{i}{j} (-1)^j Q_1^{i-j} M_j(x), \qquad M_j(x) = E[Y^j \mid X = x]. \tag{12} $$

These recursive formulas for the coefficients and for $C_k$ can
be used to compute the approximation to the percentile $Q$
to any order in $\epsilon$. However, the complexity of the explicit
formulas for the coefficients increases with their order. The
first terms in the perturbative series are

$$
\begin{aligned}
Q_0 &= F_X^{-1}(\alpha) \\
Q_1 &= E[Y \mid X = Q_0] \\
Q_2 &= -\frac{1}{f_X(Q_0)} \Big[ \partial_x \big( f_X(x) \mathcal{M}_2(x) \big) \Big]_{x=Q_0} \\
Q_3 &= -\frac{1}{f_X(Q_0)} \Big[ \partial_x^2 \big( f_X(x) \mathcal{M}_3(x) \big) + 3 Q_2\, \partial_x \big( f_X(x) \mathcal{M}_1(x) \big) \Big]_{x=Q_0} \\
Q_4 &= -\frac{1}{f_X(Q_0)} \Big[ \partial_x^3 \big( f_X(x) \mathcal{M}_4(x) \big) + 6 Q_2\, \partial_x^2 \big( f_X(x) \mathcal{M}_2(x) \big) + 4 Q_3\, \partial_x \big( f_X(x) \mathcal{M}_1(x) \big) + 3 Q_2^2\, \partial_x f_X(x) \Big]_{x=Q_0}
\end{aligned} \tag{13}
$$

The term $Q_2$ can be expressed in terms of the conditional
variance ($\mathrm{Var}[Y \mid X = x]$) instead of $\mathcal{M}_2(x)$ because, for
this particular term, $Q_1$ can be replaced by $E[Y \mid X = x]$.
This substitution is not possible in general for higher order
terms.

These general expressions for the terms in a perturbative
expansion of the percentiles of the sum of two random variables
will be applied in the following section to sums of N
independent random variables, where N can be deterministic
or stochastic.



3 Perturbative expansion around the percentile
of the maximum

In this section (13) is used to estimate high percentiles of
the sums of independent random variables with heavy tails

$$ Z_N = \sum_{i=1}^{N} L_i, \tag{14} $$

where $\{L_i\}_{i=1}^{N}$ are positive iidrv's sampled from $F(l)$ (the
corresponding density is $f(l)$). Let $G(z)$ be the probability
distribution of the sum $Z_N$, and $g(z)$ the corresponding
density. The key idea is to partition the sum into two
contributions: the maximum and the sum of the remaining
terms

$$ Z_N(\epsilon) = L_{[N]} + \epsilon \sum_{i=1}^{N-1} L_{[i]}, \tag{15} $$

where $L_{[i]}$ is the $i$-th order statistic of the sample $\{L_i\}_{i=1}^{N}$
(i.e. $L_{[1]} \le L_{[2]} \le \cdots \le L_{[N]}$). The formal parameter
$\epsilon$ is introduced to order the terms in the perturbative
expansion. It is eventually set to one ($\epsilon = 1$), so that
$Z_N(1) = Z_N$. As shown in Appendix D, the perturbative
series truncated to first order provides an estimate that is
similar to approximations that can be derived from the tail
behavior of sums of subexponential variables [25J I2S1 ISO].
Therefore, the analysis presented in [22, 23] can be used to
establish the asymptotic properties of this approximation.
The issue of convergence of the perturbative series outside
of the asymptotic regime is analyzed empirically in section 5.
Qualitatively, the perturbation term in (15) is small if
$L_{[N]} \gg \sum_{i=1}^{N-1} L_{[i]}$; that is, when the sum (14) is dominated
by the maximum. This is the case when the probability
distribution of $L$ is subexponential, provided that the value of
the sum is sufficiently large [15, 16]. In consequence, the
perturbative series should be more accurate for high percentiles.
The empirical analysis carried out reveals that,
for sufficiently high percentiles, the accuracy of the approximation
initially improves as more terms are included in the
series. However, beyond a certain order the approximation
actually becomes worse when further terms are used, which
indicates that, in the cases studied, the perturbative series
is not convergent.
The probability distribution of the maximum $L_{[N]}$ is

$$ F_{[N]}(x) = F(x)^N. \tag{16} $$

The corresponding density is obtained by taking the derivative
of (16)

$$ f_{[N]}(x) = N F(x)^{N-1} f(x). \tag{17} $$



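In terms of these, the perturbative expansion (11) becomes

$$
\begin{aligned}
Q_0 &= F^{-1}\big(\alpha^{1/N}\big) \\
Q_1 &= E\Big[ \sum_{i=1}^{N-1} L_{[i]} \,\Big|\, L_{[N]} = Q_0 \Big] \\
Q_k &= -\frac{1}{f_{[N]}(Q_0)} \Bigg[ \sum_{i=1}^{k} \binom{k}{i} C_{k-i}(Q_2\partial_x, \ldots, Q_{k-i}\partial_x)\, \partial_x^{i-1} \{ f_{[N]}(x) \mathcal{M}_i(x) \} \\
&\qquad\qquad + \sum_{i=2}^{k-2} \binom{k-1}{i-1} Q_i\, C_{k-i}(Q_2\partial_x, \ldots, Q_{k-i}\partial_x)\, f_{[N]}(x) \Bigg]_{x=Q_0}, \quad \text{for } k \ge 2,
\end{aligned} \tag{18}
$$

with

$$ \mathcal{M}_i(x) = E\Big[ \Big( Q_1 - \sum_{k=1}^{N-1} L_{[k]} \Big)^i \,\Big|\, L_{[N]} = x \Big] = \sum_{j=0}^{i} \binom{i}{j} (-1)^j Q_1^{i-j} M_j(x), \tag{19} $$

where $M_j(x)$ is the $j$th conditional moment of the random
variable $Y_N = \sum_{i=1}^{N-1} L_{[i]}$

$$ M_j(x) = E\Big[ \Big( \sum_{i=1}^{N-1} L_{[i]} \Big)^j \,\Big|\, L_{[N]} = x \Big]. \tag{20} $$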



These closed-form expressions for the terms in the perturbative
series (18) are the main contribution of this research.
Explicit formulas for the conditional moments (20) can be
readily obtained using the invariance of $\sum_{i=1}^{N-1} L_{[i]}$ under
an arbitrary permutation of the indices

$$ M_j(x) = E\Big[ \Big( \sum_{i=1}^{N-1} L_{[i]} \Big)^j \,\Big|\, L_{[N]} = x \Big] = E\Big[ \Big( \sum_{i=1}^{N-1} L_i \Big)^j \,\Big|\, \{L_i < x\}_{i=1}^{N-1} \Big] = \int \prod_{i=1}^{N-1} dl_i\, \Big( \sum_{i=1}^{N-1} l_i \Big)^j f\big( \{l_i\}_{i=1}^{N-1} \,\big|\, \{l_i < x\}_{i=1}^{N-1} \big). \tag{21} $$

The last quadrature is the average of the $j$th power of the
sum of $N-1$ independent random variables $\{L_i\}_{i=1}^{N-1}$, whose
joint distribution is

$$ f\big( \{l_i\}_{i=1}^{N-1} \,\big|\, \{l_i < x\}_{i=1}^{N-1} \big) = \prod_{i=1}^{N-1} \frac{f(l_i)}{F(x)}\, \theta(x - l_i), \tag{22} $$

where $\prod_{i=1}^{N-1} \theta(x - l_i)$ is a product of Heaviside step functions,
which is equal to 1 in the region $\{l_i < x\}_{i=1}^{N-1}$ and 0
outside this region. Using the definition of the complete
Bell polynomials (81), it is possible to express the $j$th moment
of the sum $\sum_{i=1}^{N-1} l_i$, where the terms in the sum are
constrained to be in the region $\{l_i < x\}_{i=1}^{N-1}$,

$$ M_j(x) = B_j\big( K_1(x), \ldots, K_j(x) \big), \tag{23} $$

in terms of the conditional cumulants $K_j(x)$, defined as

$$ K_j(x) = \left. \frac{d^j}{ds^j} \log\left( \int dy\, e^{sy} f_{Y_N \mid X_N}(y \mid x) \right) \right|_{s=0}. \tag{24} $$

Finally, using the property that the $p$th cumulant of a sum
of independent variables is the sum of the $p$th cumulants of
the individual variables

$$ K_p(x) = (N-1)\,\kappa_p(x), \quad p = 1, 2, \ldots \tag{25} $$

we obtain
$$ M_j(x) = B_j\big( (N-1)\kappa_1(x), \ldots, (N-1)\kappa_j(x) \big), \tag{26} $$

where $\kappa_j(x)$ is the $j$th censored cumulant of $L$

$$ \kappa_j(x) = \left. \frac{d^j}{ds^j} \log\left( \int_0^x dl\, e^{sl}\, \frac{f(l)}{F(x)} \right) \right|_{s=0}. \tag{27} $$

These censored cumulants can also be expressed in terms
of the censored moments of $L$

$$ \kappa_j(x) = \mu_j(x) - \sum_{i=1}^{j-1} \binom{j-1}{i} \kappa_{j-i}(x)\, \mu_i(x), \tag{28} $$

$$ \mu_j(x) = \frac{1}{F(x)} \int_0^x dl\, l^j f(l), \quad \text{for } j = 1, 2, \ldots \tag{29} $$

Using these relations, it is possible to derive explicit formulas
for the terms in the perturbative series. In particular,
the first three are

$$ Q_0 = F^{-1}\big(\alpha^{1/N}\big) \tag{30} $$

$$ Q_1 = (N-1)\, E[L \mid L < Q_0] \tag{31} $$

$$
\begin{aligned}
Q_2 &= -\frac{N-1}{F(Q_0)^{N-1} f(Q_0)} \Big[ \partial_x \big( F(x)^{N-1} f(x)\, \mathrm{Var}[L \mid L < x] \big) \Big]_{x=Q_0} \\
&= -(N-1) \left[ \left( (N-2) \frac{f(Q_0)}{F(Q_0)} + \frac{f'(Q_0)}{f(Q_0)} \right) \mathrm{Var}[L \mid L < Q_0] + \frac{f(Q_0)}{F(Q_0)} \big( Q_0 - E[L \mid L < Q_0] \big)^2 \right].
\end{aligned} \tag{32}
$$

An attractive feature of this expansion is that the approximation
of order K depends only on the censored moments
of F of order lower than or equal to K. Since they are censored,
these always exist, even for distributions whose moments
diverge. These expressions have been obtained for cases
in which the number of terms in the sum (14) is fixed. In
the next section, we derive closed-form expressions for sums
with a random number of terms.



3.1 Sums with a random number of terms

In many applications the quantities of interest are aggregate
random variables consisting of a variable number of terms

$$ Z_N = \sum_{i=1}^{N} L_i, \tag{33} $$

where N is a discrete random variable whose probability
mass function is

$$ P[N = n] = p_n, \quad n = 0, 1, 2, \ldots \tag{34} $$

In insurance and operational risk [4, 5], where $Z_N$ represents
the aggregate loss in a fixed time period (e.g. yearly
losses), N is referred to as the frequency of the loss events.
For convenience, we will use this term to refer to N in the
remainder of the article.

Consider the random variable $Z_N = X_N + Y_N$, with

$$ X_N = L_{[N]} \tag{35} $$

$$ Y_N = \sum_{i=1}^{N-1} L_{[i]} \tag{36} $$



as in (14, 15), where N is now an integer random variable.
We denote $X_n = L_{[n]}$ and $Y_n = \sum_{i=1}^{n-1} L_{[i]}$ the corresponding
random variables conditional on a fixed value $N = n$.
In terms of the probability distribution of $L_{[n]}$, the
maximum of the $n$ terms in the
sum ($F_{[n]}(x) = F(x)^n$), and of the corresponding density
($f_{[n]}(x) = n F(x)^{n-1} f(x)$), the probability distribution and
the density of $L_{[N]}$ are

$$ F_{[N]}(x) = \sum_{n=0}^{\infty} p_n F_{[n]}(x), \qquad f_{[N]}(x) = \sum_{n=0}^{\infty} p_n f_{[n]}(x), \tag{37} $$

respectively.

For random N the zeroth order term in the perturbative
expansion $Q_0$ satisfies the relation

$$ \alpha = F_{[N]}(Q_0) = \sum_{n=0}^{\infty} p_n F_{[n]}(Q_0) = \sum_{n=0}^{\infty} p_n F(Q_0)^n = E\big[ F(Q_0)^N \big] = M_N\big( \log F(Q_0) \big), \tag{38} $$

where $M_N(s)$ is the moment generating function of the
random variable N

$$ M_N(s) = E\big[ e^{sN} \big] = \sum_{n=0}^{\infty} p_n e^{sn}. \tag{39} $$



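Using this definition we can invert (38) to obtain

$$ Q_0 = F^{-1}\Big( e^{M_N^{-1}(\alpha)} \Big). \tag{40} $$

Starting from (10) with $k = 1$ it is possible to derive an
expression for the first term in the perturbative series in
terms of $Q_0$

$$ Q_1 = \frac{\sum_{n=0}^{\infty} p_n f_{[n]}(Q_0)\, E[Y_n \mid X_n = Q_0]}{\sum_{n=0}^{\infty} p_n f_{[n]}(Q_0)}. \tag{41} $$

Using the explicit form of the probability distribution of the
maximum and equation (21), we get

$$ Q_1 = \frac{E\big[ N(N-1) F(Q_0)^N \big]}{E\big[ N F(Q_0)^N \big]}\, E[L \mid L < Q_0]. \tag{42} $$

For the higher order coefficients an analogous derivation
from (11) yields

$$
\begin{aligned}
Q_k \sum_{n=0}^{\infty} p_n f_{[n]}(Q_0) = -\Bigg[ & \sum_{s=1}^{k} \binom{k}{s} C_{k-s}(Q_2\partial_x, \ldots, Q_{k-s}\partial_x)\, \partial_x^{s-1} U_s(x) \\
& + \sum_{s=2}^{k-2} \binom{k-1}{s-1} Q_s\, C_{k-s}(Q_2\partial_x, \ldots, Q_{k-s}\partial_x) \sum_{n=0}^{\infty} p_n f_{[n]}(x) \Bigg]_{x=Q_0}, \quad \text{for } k \ge 2,
\end{aligned} \tag{43}
$$

where

$$
\begin{aligned}
U_s(x) &= \sum_{n=0}^{\infty} p_n f_{[n]}(x)\, E[(Q_1 - Y_n)^s \mid X_n = x] = \sum_{n=0}^{\infty} p_n f_{[n]}(x) \sum_{q=0}^{s} \binom{s}{q} (-1)^q Q_1^{s-q} M_{n,q}(x) \\
M_{n,q}(x) &= B_q\big( (n-1)\kappa_1(x), \ldots, (n-1)\kappa_q(x) \big).
\end{aligned} \tag{44}
$$

To compute the expected values over the frequency, one
needs to isolate the dependency on N. For this purpose,
it is convenient to use an alternative representation of the
Bell polynomials that expresses moments in terms
of cumulants using partitions of sets (Appendix A, eq. (84))

$$ M_{n,q}(x) = \sum_{A \in \mathcal{P}(q)} (n-1)^{|A|} \prod_{b \in A} \kappa_{|b|}(x), \tag{45} $$

where $\mathcal{P}(q)$ is the set of all partitions of the set $\{1, 2, \ldots, q\}$,
$|A|$ and $|b|$ denote the number of elements in the sets $A$
and $b$ respectively, and $\kappa_{|b|}(x)$ is the $|b|$th censored cumulant
of $L$, as defined in (27).

Using this expression the coefficients become

$$
\begin{aligned}
Q_k = -\frac{1}{\lambda_0(Q_0)} \Bigg[ & \sum_{s=1}^{k} \binom{k}{s} C_{k-s}(Q_2\partial_x, \ldots, Q_{k-s}\partial_x)\, \partial_x^{s-1} U_s(x) \\
& + \sum_{s=2}^{k-2} \binom{k-1}{s-1} Q_s\, C_{k-s}(Q_2\partial_x, \ldots, Q_{k-s}\partial_x)\, \lambda_0(x) \Bigg]_{x=Q_0}, \quad \text{for } k \ge 2,
\end{aligned} \tag{46}
$$

with

$$
\begin{aligned}
U_s(x) &= \sum_{q=0}^{s} \binom{s}{q} (-1)^q Q_1^{s-q} \sum_{A \in \mathcal{P}(q)} \lambda_{|A|}(x) \prod_{b \in A} \kappa_{|b|}(x) \\
\lambda_a(x) &= E_N\big[ (N-1)^a f_{[N]}(x) \big] = \frac{f(x)}{F(x)}\, E\big[ N (N-1)^a F(x)^N \big], \quad \text{for } a \ge 0.
\end{aligned} \tag{47}
$$

The explicit expressions for the first four coefficients are

$$
\begin{aligned}
Q_0 &= F^{-1}\Big( e^{M_N^{-1}(\alpha)} \Big) \\
Q_1 &= \frac{\lambda_1(Q_0)}{\lambda_0(Q_0)}\, \kappa_1(Q_0) \\
Q_2 &= -\frac{1}{\lambda_0(Q_0)} \Big[ \partial_x \big( Q_1^2 \lambda_0 - 2 Q_1 \lambda_1 \kappa_1 + \lambda_1 \kappa_2 + \lambda_2 \kappa_1^2 \big) \Big]_{x=Q_0} \\
Q_3 &= -\frac{1}{\lambda_0(Q_0)} \Big[ \partial_x^2 \big( Q_1^3 \lambda_0 - 3 Q_1^2 \lambda_1 \kappa_1 + 3 Q_1 (\lambda_1 \kappa_2 + \lambda_2 \kappa_1^2) - \lambda_1 \kappa_3 - 3 \lambda_2 \kappa_1 \kappa_2 - \lambda_3 \kappa_1^3 \big) \\
&\qquad\qquad\quad + 3 Q_2\, \partial_x \big( Q_1 \lambda_0 - \lambda_1 \kappa_1 \big) \Big]_{x=Q_0},
\end{aligned} \tag{48}
$$

where, to simplify the notation, the dependence on $x$ in the
$\lambda_a(x)$ and $\kappa_b(x)$ has been omitted.

The functions $\{\lambda_a(x);\ a = 0, 1, 2, \ldots\}$ can also be expressed
in terms of the moment generating function of N as

$$ \lambda_a(x) = \frac{f(x)}{F(x)} \left. \partial_s (\partial_s - 1)^a M_N(s) \right|_{s = \log F(x)}, \quad \text{for } a \ge 0. \tag{49} $$

Explicit expressions for the Poisson and negative binomial
probability distributions are given in Appendix B. These
types of distributions are commonly used in applications.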
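The zeroth and first order terms for a random number of terms can be sketched numerically. The example below assumes a Poisson frequency with mean `lam` and a Pareto severity F(x) = 1 - x^(-a) on x >= 1; both choices, and all function names, are illustrative assumptions rather than the article's implementation. For the Poisson case, M_N(s) = exp(lam (e^s - 1)), so inverting (38) gives F(Q_0) = 1 + log(alpha)/lam, and the frequency ratio in (42) evaluates to lam F(Q_0).

```python
import math

def pareto_quantile(p, a):
    """Inverse of F(x) = 1 - x**(-a), x >= 1."""
    return (1.0 - p) ** (-1.0 / a)

def pareto_censored_mean(x, a):
    """E[L | L < x] for the Pareto severity above (requires a != 1)."""
    F = 1.0 - x ** (-a)
    return a * (x ** (1.0 - a) - 1.0) / ((1.0 - a) * F)

def poisson_zeroth_and_first(alpha, lam, a):
    """Q_0 from (38)-(40) and Q_1 from (42) for a Poisson(lam) frequency."""
    F_q0 = 1.0 + math.log(alpha) / lam   # exp(M_N^{-1}(alpha)) for Poisson
    if F_q0 <= 0.0:
        raise ValueError("alpha must exceed P[N = 0] = exp(-lam)")
    q0 = pareto_quantile(F_q0, a)
    # For Poisson, E[N(N-1)F^N] / E[N F^N] = lam * F.
    q1 = lam * F_q0 * pareto_censored_mean(q0, a)
    return q0, q1

def single_loss(alpha, lam, a):
    """Single-loss percentile F^{-1}(1 - (1 - alpha)/E[N])."""
    return pareto_quantile(1.0 - (1.0 - alpha) / lam, a)

alpha, lam, a = 0.999, 25.0, 1.5
q0, q1 = poisson_zeroth_and_first(alpha, lam, a)
q_sl = single_loss(alpha, lam, a)
```

For high percentiles `q0` and `q_sl` nearly coincide, since -log(alpha) is close to 1 - alpha; the first order term `q1` is bounded by lam times the unconditional mean whenever that mean is finite.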



3.2 Approximation in terms of frequency moments 
for high percentiles 



The formulas derived in the previous section (48) are different
from the standard single-loss approximation [2] and
corrections thereof [23 H3 [551 130]. In this section we show
that for high percentiles one recovers the single-loss approximation
and correction terms. In the limit $\alpha \to 1^-$ the
inverse of the moment generating function in (40) can be
approximated using the expansion

$$ M_N(s) = E\big[ e^{sN} \big] = 1 + s E[N] + O(s^2), \tag{50} $$

for $s \to 0$. From this expression,

$$ M_N^{-1}(\alpha) \approx -\frac{1 - \alpha}{E[N]}, \quad \text{for } \alpha \to 1^-. \tag{51} $$

This leads to the standard single-loss approximation [14]

$$ Q_0 \approx Q_{SL} = F^{-1}\left( 1 - \frac{1 - \alpha}{E[N]} \right). \tag{52} $$



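In this limit, the survival function $S(x) = 1 - F(x)$ approaches
0, and simpler approximate expressions for $\lambda_a(x)$
are obtained by keeping terms only up to first order in $S(x)$

$$
\begin{aligned}
\lambda_a(x) &= \partial_x E\big[ (N-1)^a (1 - S(x))^N \big] \\
&\approx \partial_x E\big[ (N-1)^a (1 - N S(x)) \big] \\
&= -\partial_x S(x)\, E\big[ N (N-1)^a \big] \\
&= f(x) \sum_{s=0}^{a} \binom{a}{s} (-1)^{a-s} \nu_{s+1}, \quad a = 0, 1, \ldots
\end{aligned} \tag{53}
$$

where $\nu_s = E[N^s]$ are the moments of the frequency distribution.
Using these approximations, the high-percentile
corrections to the single-loss formula can be expressed directly
in terms of the moments of the frequency distribution

$$ Q_1 \approx \left( \frac{E[N^2]}{E[N]} - 1 \right) E[L \mid L < Q_0] \tag{54} $$

$$ Q_2 \approx -\frac{1}{f(Q_0)} \left\{ \left( \frac{E[N^2]}{E[N]} - 1 \right) \Big[ \partial_x \big( f(x)\, \mathrm{Var}[L \mid L < x] \big) \Big]_{x=Q_0} + \left( \frac{E[N^3]}{E[N]} - \frac{E[N^2]^2}{E[N]^2} \right) \Big[ \partial_x \big( f(x)\, E[L \mid L < x]^2 \big) \Big]_{x=Q_0} \right\} \tag{55} $$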



The approximation to $Q_1$ is similar to the corrections to the
single loss formula proposed in the literature [3"T1 1251 |2"51 I3"U].
In section 4 we provide a review of these corrections. Their
accuracy will be compared to the perturbative expansion
in section 5. To make the numerical computation of the
perturbative approximation up to high orders feasible it is
useful to express the terms of the series recursively. These
recursive expressions are presented in Appendix C.
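The recursions of Appendix C are not reproduced here. As a generic illustration of how such quantities can be computed recursively, the sketch below uses two standard recurrences: the one for complete Bell polynomials, B_{m+1} = sum_i C(m, i) B_{m-i} x_{i+1}, and the moment-to-cumulant recursion (28). It evaluates the censored moments (26) of a sum of N - 1 censored losses. The Pareto severity F(x) = 1 - x^(-a) on x >= 1 and all function names are assumptions made for the example.

```python
import math
from math import comb

def complete_bell(xs):
    """Return [B_0, ..., B_n] for the arguments xs = [x_1, ..., x_n]."""
    B = [1.0]
    for m in range(len(xs)):
        B.append(sum(comb(m, i) * B[m - i] * xs[i] for i in range(m + 1)))
    return B

def cumulants_from_moments(mu):
    """Map raw moments [mu_1, ..., mu_n] to cumulants [kappa_1, ..., kappa_n], eq. (28)."""
    kappa = []
    for j in range(1, len(mu) + 1):
        s = sum(comb(j - 1, i) * kappa[j - i - 1] * mu[i - 1] for i in range(1, j))
        kappa.append(mu[j - 1] - s)
    return kappa

def pareto_censored_moments(x, a, n):
    """mu_j(x) = E[L^j | L < x], j = 1..n, for F(x) = 1 - x**(-a) (a not an integer <= n)."""
    F = 1.0 - x ** (-a)
    return [a * (x ** (j - a) - 1.0) / ((j - a) * F) for j in range(1, n + 1)]

def sum_censored_moments(x, a, N, n):
    """M_j(x), j = 1..n: moments of the sum of N - 1 censored losses, eq. (26)."""
    kappa = cumulants_from_moments(pareto_censored_moments(x, a, n))
    return complete_bell([(N - 1) * k for k in kappa])[1:]

mu = pareto_censored_moments(50.0, 1.5, 4)
kappa = cumulants_from_moments(mu)
one_term = sum_censored_moments(50.0, 1.5, 2, 4)    # N - 1 = 1: recovers mu
four_terms = sum_censored_moments(50.0, 1.5, 5, 4)  # N - 1 = 4
```

With a single censored term the Bell polynomials simply invert the cumulant recursion, which gives a quick consistency check of both routines.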



4 Related work 

In this section we review closed-form approximations for
the percentile of sums of positive iidrv's that have been
proposed in previous investigations. Even though it is possible
to derive approximations for particular heavy-tailed
distributions, such as [35] for the Pareto distribution, in this
work we consider comparisons only with approximations for
general subexponential distributions [TH [T|5]. The single-loss
approximation can be derived using first order asymptotics
of the tail of sums of subexponential random variables
[371GIH1Q3]. Higher order asymptotic expansions of the tails
of the compound distribution [2"2 l |2"31 |2"t[ [27 1 $E\ [2^ ] can be
used to obtain corrections to the single-loss approximation
[231 l2lfl I3T)]. These high order corrections are similar to the
successive terms in the perturbative expansion analyzed in
this article. However, there are some important differences.
In particular, the terms of the perturbative expansion are
expressed as a function of right-censored moments, which are
always finite. In the section on experimental evaluation
(section 5) we will further show that the perturbative series
provides more accurate approximations than the expressions
introduced in this section.

One of the defining properties of subexponential distributions
is that large values of sums of subexponential random
variables are dominated by the maximum

$$ Z_N = \sum_{i=1}^{N} L_i \approx \max\{L_1, \ldots, L_N\}, \quad Z_N \to \infty. \tag{56} $$

In insurance mathematics this corresponds to the 'one loss
causes ruin' regime [1]. Using the property of subexponential
distributions [33 EH|

$$ \lim_{x \to \infty} \frac{P(L_1 + \cdots + L_N > x)}{P(L_1 > x)} = N, \tag{57} $$



it is possible to show that, for this type of distribution, the
percentile of $Z_N$ at the probability level $\alpha$ is approximately

$$ Q_{SL} = F^{-1}\left( 1 - \frac{1 - \alpha}{N} \right), \quad \text{for } \alpha \to 1^-. \tag{58} $$

In this limit, expression (58) is very similar to the zeroth
order term in the perturbative expansion

$$ Q_0 = F^{-1}\big( \alpha^{1/N} \big) = F^{-1}\left( 1 - \frac{1 - \alpha}{N} + O\big( (1 - \alpha)^2 \big) \right) \approx Q_{SL}. \tag{59} $$

The derivation of a closed-form approximation for high percentiles
using first order tail asymptotics can be readily extended
to sums of subexponential iidrv's with a random
number of terms

$$ Q_{SL} = F^{-1}\left( 1 - \frac{1 - \alpha}{E[N]} \right), \tag{60} $$

where $E[N]$ is the average number of terms in the sum. In
the area of operational risk, this expression is known as the
'single-loss approximation' O [21].

Using heuristic arguments, a correction to the single-loss
approximation was proposed in [31] for distributions with
finite mean

$$ Q \approx F^{-1}\left( 1 - \frac{1 - \alpha}{E[N]} \right) + (E[N] - 1)\, \mu_L, \qquad \mu_L = E[L]. \tag{61} $$

In the limit $\alpha \to 1^-$, the value $Q_0$ is large, so that
$E[L \mid L < Q_0] \approx E[L]$ and the approximation given by (61)
becomes similar to (151 



Besides the heuristic derivation given in [31] and the pertur- 
bative expansion proposed in this work, higher order correc- 
tions to the single-loss approximation can be derived in at 
least three different ways: Using the second order asymp- 
totic approximations introduced in [23 [23 [23 [22], from 
the asymptotic expansion analyzed in [23 [23 [22] or from 
asymptotic approximations based on evaluations of F(l) at 
different arguments [50] . 

In the case of distributions with finite mean, the asymptotic
analysis of the tail of a subordinated distribution analyzed
in [23] can be used to obtain $Q_{OW}$, a second order approximation
of the percentile of sums of subexponential iidrv's,
as the solution of

$$ Q_{OW} = F^{-1}\left( 1 - \frac{1 - \alpha}{E[N]} + \left( \frac{E[N^2]}{E[N]} - 1 \right) \mu_L\, f(Q_{OW}) \right). \tag{62} $$

This implicit nonlinear equation can be solved numerically
using, for example, an iterative scheme. Alternatively, one
can retain only the leading terms in a perturbative expansion
of this expression

$$ Q^*_{OW} = Q_{SL} + \big( E[N] + (D - 1) \big)\, \mu_L, \tag{63} $$

where $D = \mathrm{Var}[N]/E[N]$ is the index of dispersion ($D = 1$
for the Poisson distribution and $D > 1$ for the negative
binomial distribution). The first term in (63) is the single-loss
approximation [14, 3T]. The second term is a correction
that involves the mean and is similar to (61) when $E[N] \gg 1$
and $D \approx 1$. As shown in Appendix D, expression (63) can
be derived in a number of different ways [261 fI7\ 135] fZ§\ |2U].

In the case of distributions with infinite mean, in which the
density is regularly varying at infinity with index $-(1 + a)$,
$f(l) \in RV_{-(1+a)}$ [2H], the second order approximation of
$Q = G^{-1}(\alpha)$ satisfies the relation [22]

$$ Q_{OW} = F^{-1}\left( 1 - \frac{1 - \alpha}{E[N]} + c_a \left( \frac{E[N^2]}{E[N]} - 1 \right) \mu_F(Q_{OW})\, f(Q_{OW}) \right), \tag{64} $$

where

$$ \mu_F(x) = \int_0^x ds\, (1 - F(s)) = (1 - F(x))\, x + F(x)\, E[L \mid L < x], \tag{65} $$

and

$$ c_a = \left( 1 - \frac{1}{a} \right) \frac{\Gamma(1 - a)^2}{2\, \Gamma(1 - 2a)}, \tag{66} $$

where $\Gamma(x)$ is the gamma function. Besides numerical
schemes, an approximate closed-form expression of the percentile,
$Q^*_{OW}$, can be obtained using a perturbative scheme
analogous to the finite mean case

$$ Q^*_{OW} = Q_{SL} + c_a \big( E[N] + (D - 1) \big)\, \mu_F(Q_{SL}), \tag{67} $$

$$ \mu_F(Q_{SL}) = \frac{1 - \alpha}{E[N]}\, Q_{SL} + \left( 1 - \frac{1 - \alpha}{E[N]} \right) E[L \mid L < Q_{SL}]. \tag{68} $$



Appendix D presents the detailed derivations of these
approximations and the connections with the perturbative
approach introduced in the current article. The
main difference with previous proposals is that the
perturbative expansion involves the moments of right-truncated
distributions. Since these censored moments are
always finite, the same expressions are valid for distributions
with finite and with infinite mean. As illustrated
in the following section, the perturbative expansion provides
accurate approximations of high percentiles of sums
of iidrv's for a variety of distributions and a wide range of
parameters, regardless of whether the mean of the random
variables in the sum is finite or infinite.
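The closed-form approximations for the finite mean case reviewed in this section are straightforward to evaluate. The following sketch computes the single-loss formula (60), the mean-corrected formula (61) and the explicit second order formula (63), assuming a lognormal severity with parameters `mu` and `sigma` and a frequency specified only through its mean and variance. The choice of severity and the function names are assumptions for illustration.

```python
import math
from statistics import NormalDist

def lognormal_quantile(p, mu, sigma):
    """Inverse CDF of a lognormal with log-mean mu and log-standard-deviation sigma."""
    return math.exp(mu + sigma * NormalDist().inv_cdf(p))

def lognormal_mean(mu, sigma):
    return math.exp(mu + 0.5 * sigma**2)

def single_loss(alpha, mean_n, mu, sigma):
    """Eq. (60): F^{-1}(1 - (1 - alpha)/E[N])."""
    return lognormal_quantile(1.0 - (1.0 - alpha) / mean_n, mu, sigma)

def mean_corrected(alpha, mean_n, mu, sigma):
    """Eq. (61): single-loss percentile plus (E[N] - 1) E[L]."""
    return single_loss(alpha, mean_n, mu, sigma) + (mean_n - 1.0) * lognormal_mean(mu, sigma)

def second_order(alpha, mean_n, var_n, mu, sigma):
    """Eq. (63): single-loss percentile plus (E[N] + D - 1) E[L], with D = Var[N]/E[N]."""
    dispersion = var_n / mean_n
    return single_loss(alpha, mean_n, mu, sigma) + (mean_n + dispersion - 1.0) * lognormal_mean(mu, sigma)

# Poisson frequency with mean 10 (variance equals mean), lognormal(0, 2) severity.
alpha, lam, mu, sigma = 0.999, 10.0, 0.0, 2.0
q_sl = single_loss(alpha, lam, mu, sigma)
q_mc = mean_corrected(alpha, lam, mu, sigma)
q_ow = second_order(alpha, lam, lam, mu, sigma)
```

For a Poisson frequency the two corrected formulas differ by exactly one mean loss, which illustrates why they are described as similar when E[N] is large and D is close to 1.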



5 Empirical evaluation 

In this section we investigate the properties of the perturbative
expansion of the $\alpha$-percentile of the aggregate distribution
introduced in this work, when $\alpha$ is close to 1.
The accuracy of this perturbative expansion is compared
to the second order asymptotic approximations (62)-(67) for
different types of distributions and different values of $\alpha$.
The types of distributions, ranges of parameters and percentile
levels used to carry out the empirical evaluation of
the proposed approximations are in the range of those commonly
used in applications in insurance and finance [4, 5],
especially in the area of operational risk [71 [51 E]. The
derivation of closed-form approximations for the estimation of
high percentiles in these areas of application is extremely
relevant because of the large computational costs of the
standard methods, such as MC simulation, which are used
to compute the risk measures.

The comparisons among the different approximations are
made in terms of the relative error $(Q_{approx} - Q)/Q$, where
$Q_{approx}$ is an approximation of the percentile (either $Q_{OW}$,
$Q^*_{OW}$ or $Q^{(K)}$, the truncation of the perturbative series at
order $K$), and $Q$ is the exact percentile. The sign of the
error is retained in most cases to make it clear whether the
approximation over- or underestimates the true value of the
percentile. When the true value of the percentile cannot be
computed exactly, it is estimated via Monte Carlo simulation.
Due to the heavy-tailedness of the severity distributions
considered, many simulations are required to achieve
sufficient precision in the percentile estimation. The Monte
Carlo estimates have been obtained using OpVision, a
software system for the analysis and quantification of operational
risk in the Advanced Measurement Approaches
(AMA) framework [40]. In all cases, the error of the Monte
Carlo estimates is at most 0.1% at a 95% confidence level.
If the approximations analyzed are more accurate than this
threshold, more simulations are performed to obtain reliable
estimates of the accuracy. Error bands for the Monte
Carlo estimates are displayed in all the graphs except for
the Levy case, where the percentiles can be calculated exactly.
In many cases these sampling errors are much smaller
than the errors of the approximations considered and this
band cannot be discerned in the plots.
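The relative-error comparison can be reproduced in a small case where the exact percentile is known. The sketch below is an illustration under stated assumptions, not the article's MATLAB code: it takes a fixed number N of iid Levy variables with scale c = 1, uses the stability property of the Levy law (the sum of N such variables is again Levy with scale N^2 c) to obtain the exact percentile, and evaluates the approximations of order 0, 1 and 2 from (30)-(32). The closed forms for the censored moments are obtained here with the substitution t = sqrt(c/l), and the function names are illustrative.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def levy_cdf(x, c=1.0):
    return math.erfc(math.sqrt(c / (2.0 * x)))

def levy_pdf(x, c=1.0):
    return math.sqrt(c / (2.0 * math.pi)) * math.exp(-c / (2.0 * x)) / x**1.5

def levy_quantile(p, c=1.0):
    # F(x) = 2 * (1 - Phi(sqrt(c/x)))  =>  x = c / Phi^{-1}(1 - p/2)**2
    return c / _STD_NORMAL.inv_cdf(1.0 - 0.5 * p) ** 2

def levy_censored_moments(x, c=1.0):
    """Return E[L | L < x] and E[L^2 | L < x] in closed form."""
    t = math.sqrt(c / x)
    phi = math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)
    F = levy_cdf(x, c)
    j2 = phi / t - 0.5 * F          # (2 pi)^(-1/2) * integral of u^-2 exp(-u^2/2) over (t, inf)
    j4 = (phi / t**3 - j2) / 3.0    # same with u^-4, by integration by parts
    return 2.0 * c * j2 / F, 2.0 * c * c * j4 / F

def perturbative_levy(alpha, N, c=1.0):
    """Approximations of order 0, 1 and 2 for the sum of N iid Levy(c) variables."""
    q0 = levy_quantile(alpha ** (1.0 / N), c)                  # eq. (30)
    m1, m2 = levy_censored_moments(q0, c)
    var = m2 - m1 * m1
    f, F = levy_pdf(q0, c), levy_cdf(q0, c)
    dlogf = -1.5 / q0 + c / (2.0 * q0 * q0)                    # f'(x)/f(x)
    q1 = (N - 1) * m1                                          # eq. (31)
    q2 = -(N - 1) * (((N - 2) * f / F + dlogf) * var
                     + f / F * (q0 - m1) ** 2)                 # eq. (32)
    return q0, q0 + q1, q0 + q1 + 0.5 * q2

alpha, N = 0.99, 10
exact = levy_quantile(alpha, c=float(N * N))
rel_errors = [(q - exact) / exact for q in perturbative_levy(alpha, N)]
```

In this configuration the magnitude of the relative error decreases from order 0 to order 2, in line with the behavior described for truncations at low orders.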

The recursive formulas used for the calculation of the terms
in the perturbative expansion are given in Appendix C. The
computational cost of obtaining an approximation with $K$
terms is $O(K^4)$, where $K$ is the order at which the perturbative
series is truncated. An implementation in MATLAB
of the perturbative expansion is publicly available.
In the experiments reported, the computations are numerically
stable. However, numerical instabilities eventually
appear for higher orders, higher quantiles and/or heavier-tailed
distributions.

The convergence properties of the perturbative series are
also of great importance. Even though a formal analysis
of this question is beyond the scope of this work, we have
carried out an empirical investigation of the accuracy of the
approximation as a function of the order at which the perturbative
expansion is truncated. The results reported are
for sums of a fixed number of lognormal iidrv's. Nonetheless,
similar patterns are obtained for other distributions
(e.g. Pareto) in other ranges of parameters and in sums of
iidrv's with random numbers of terms. In Figure 1 the relative
error of the quantile estimations for a sum of N = 100
lognormal iidrv's is plotted as a function of the order of the
perturbative expansion, for different quantile levels. From
these results it is apparent that the series converges only
asymptotically for $\alpha \to 1^-$. The asymptotic behavior of
the series is analyzed in detail for the particular case of the
Pareto distribution in section 5.3.3. For a fixed quantile
level, the accuracy of the approximation initially improves
as more terms are included in the expansion, but becomes
worse beyond a certain order. Nonetheless, for a given order,
there is a quantile level above which the series truncated
to this order is a more accurate approximation than
the series truncated to lower orders. As heavier tails imply
stronger dominance of the maximum in the sum, the heavier
the tails of the distribution, the more accurate the
approximation becomes. Hence, the order beyond which
the approximation deteriorates is larger for distributions
with heavier tails. Finally, the accuracy of the perturbative
expansion becomes poorer for increasing N.

In summary, in the cases analyzed, the accuracy of the ap- 
proximation initially improves as more terms are included 
in the perturbative approximation. However, beyond a cer- 
tain order, adding further terms in the expansion leads to an 
increase of the error. In the experiments carried out in the 
remainder of this section, the series is truncated at interme- 
diate orders (K = 3 or K = 5), which, for the considered ex- 
amples, provide very accurate approximations. The results 
of these experiments are presented in separate subsections, 
each of which corresponds to different types of distributions 
of the individual random variables in the sum. 
Figure 1: Relative error for Lognormal ($\sigma = 2.5$) with frequency
$N = 100$, as a function of the coefficient order for $\alpha = 90\%$
(upper plot), $\alpha = 92.5\%$ (middle plot) and $\alpha = 95\%$ (lower
plot). The horizontal lines delimit the 95% confidence interval
of the Monte Carlo simulation of the exact quantile.

5.1 Levy distribution 

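In this section we evaluate the accuracy of the different
approximations of high percentiles of the sum of iidrv's that
follow a Levy distribution

$$f(x) = \sqrt{\frac{c}{2\pi}}\,\frac{1}{x^{3/2}}\,e^{-c/(2x)}, \qquad F(x) = \operatorname{erfc}\left(\sqrt{\frac{c}{2x}}\right) \quad \text{for } x > 0, \qquad (69)$$
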

where erfc(y) is the complementary error function. The
mean of the Levy distribution is infinite. The tail of the
probability distribution, $1 - F(x)$, is a function of regular
variation $RV_{-a}$ and the density, $f(x)$, is $RV_{-(1+a)}$ with
$a = 1/2$. This is a particularly useful case to analyze because
the Levy distribution belongs to the family of stable
distributions [32]. Therefore, the sum of $N$ independent
identically distributed (iid) Levy random variables,
$Z_N = \sum_{i=1}^{N} L_i$, is also of the Levy form

$$f_{Z_N}(x) = \sqrt{\frac{c_N}{2\pi}}\,\frac{1}{x^{3/2}}\,e^{-c_N/(2x)}, \qquad c_N = N^2 c. \qquad (70)$$

In the case of deterministic $N$, the $\alpha$-percentile is

$$Q = \frac{c}{2}\,N^2 \left[\operatorname{erf}^{-1}(1-\alpha)\right]^{-2}. \qquad (71)$$

Figure 2: Absolute value of the relative error of the different
approximations to the $\alpha$-percentile of the sum of $N$ independent
identically distributed Levy random variables as a function of $\alpha$
for $N = 100$ (upper plot) and $N = 1000$ (lower plot).


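For Levy random variables, $c_a = 0$ in (127) because $a = 1/2$.
In consequence, the second order asymptotic approximation
(64) coincides with the single-loss approximation

$$Q_{OW} = Q^*_{OW} = Q_{SL} = F^{-1}\left(1 - \frac{1-\alpha}{N}\right) = \frac{c}{2}\left[\operatorname{erf}^{-1}\left(\frac{1-\alpha}{N}\right)\right]^{-2}. \qquad (72)$$
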

The accuracy of this approximation is compared to the
perturbative series up to order 5. Figure 2 displays in a
logarithmic scale in both axes the absolute value of the relative
error of the different approximations as a function of $\alpha$ for
$N = 100$ and $N = 1000$. All approximations become more
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terms. The dependence of the relative error with N, the 
number of terms in the sum, for $\alpha = 99\%$ (upper plot) and
$\alpha = 99.9\%$ (lower plot) is shown in Figure 3. The relative
error increases with $N$. Nonetheless, the deterioration is
fairly slow. The error eventually approaches a constant, in
agreement with the large $N$ behavior of (75). Also in these
cases the perturbative series is more accurate than $Q_{OW}$.



5.2 Lognormal distribution 

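In this section we analyze the sum of iidrv's that follow a
lognormal distribution

$$f(x) = \frac{1}{x\sigma\sqrt{2\pi}}\exp\left(-\frac{(\log x)^2}{2\sigma^2}\right), \qquad F(x) = 1 - \frac{1}{2}\operatorname{erfc}\left(\frac{\log x}{\sigma\sqrt{2}}\right), \qquad (76)$$

for $x > 0$.
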



Figure 3: Relative error of the different approximations to the
percentile of the sum of iid Levy random variables as a function
of the number of terms in the sum for $\alpha = 99\%$ (upper plot) and
$\alpha = 99.9\%$ (lower plot).

accurate for higher percentiles ($\alpha \to 1^-$). In this limit the
relative error is proportional to $(1 - \alpha)^2$ for all the
approximations considered. Using the results of Appendix E the
relative error of approximation (72) is

$$\frac{Q_{OW} - Q}{Q} \to \frac{\pi}{6}\,\frac{N^2 - 1}{N^2}\,(1-\alpha)^2. \qquad (73)$$


Similarly, for the perturbative expansion truncated at
different orders

$$\frac{Q^{(k)} - Q}{Q} \to \gamma_k\,(1-\alpha)^2, \qquad k = 1, 2, \ldots \qquad (74)$$

with the coefficients $\gamma_1$, $\gamma_2$ and $\gamma_3$ given by


(2tt - 5)iV 2 - 6(tt - 3)7V + (4tt - 13) 



(N - l)(N-2) 

67V2 
(iV-l)(JV-2) 

6N* 



12A^ 2 
(tt-3) 



16 



(75) 



Up to the orders analyzed the perturbative series provides
more accurate estimates than (72), improving with
the number of terms included in this series. Nonetheless,
the relative improvements become smaller for higher order 



The lognormal is also subexponential. However, in contrast 
to the Levy distribution, all its moments are finite. The 
perturbative series, which is of the same form as in the 
previous case, also provides very accurate approximations 
of high percentiles of the sum. 

Figure 4 displays the relative error of the different
approximations as a function of $\sigma$. Larger values of $\sigma$ correspond
to heavier tails. In the simulations the number of terms in
the sum (frequency) is random and follows a Poisson
distribution whose mean is $\lambda = 100$. In all cases, the relative
error becomes smaller as $\sigma$ increases. This is consistent
with the fact that this parameter determines the heaviness
of the tail. For larger values of $\sigma$ (heavier tails) the relative
importance of the maximum in the sum increases and the
approximations, which are based on the dominance of the
maximum in the sum, become more accurate.

The second order asymptotic approximations $Q_{OW}$ and
$Q^*_{OW}$ diverge as $\sigma$ becomes larger. This is not unexpected
because the mean of the distribution increases as $e^{\sigma^2/2}$,
while the percentile of the maximum (which dominates the
sum) increases only as $e^{\sigma}$. The perturbative expansion
introduced in this work, which involves only censored moments,
avoids this problem and behaves properly. Figure 5
displays the dependence of the error of the different
approximations as a function of the percentile level (upper plot)
and of the average frequency (lower plot). As expected, all
approximations perform better at higher percentiles and
lower frequencies; that is, as the weight of the maximum in
the sum becomes larger. Even for the relatively high average
frequency $\lambda = 1000$, the accuracy of the perturbative
approximation $Q^{(3)}$ is remarkable.
Figure 4: Relative error for Lognormal/Poisson ($\lambda = 100$) as
a function of $\sigma$ for $\alpha = 99\%$ (upper plot) and $\alpha = 99.9\%$ (lower
plot).

Figure 5: Relative error for Lognormal($\sigma = 2$)/Poisson as a
function of $\alpha$ for $E[N] = 100$ (upper plot) and as a function of
$E[N]$ for $\alpha = 99.9\%$ (lower plot).

Figure 6: Relative error for Pareto/Poisson ($\lambda = 100$) as a
function of $a$ for $\alpha = 99\%$ (upper plot) and $\alpha = 99.9\%$ (lower
plot). The values of $a$ are ordered so that the heaviness of the
tails increases from left to right in the plots.



5.3 Pareto distribution 

In this section we analyze the sum of iidrv's that follow a
Pareto distribution

$$f(x) = \frac{a}{x^{a+1}}, \qquad F(x) = 1 - \frac{1}{x^{a}}, \qquad x \geq 1, \qquad (77)$$

with $a > 0$. Since the second order asymptotic approximations
have a different form depending on whether the
mean is defined or not, we consider two separate regimes:
$a > 1$, where the mean of the Pareto distribution is finite,
and $a < 1$, where the mean diverges. It is worth noting that
the perturbative expansion introduced in this work has the
same expression in both regimes and is in fact continuous
at $a = 1$.
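The continuity at $a = 1$ can be illustrated with the first two terms of the expansion for a Poisson frequency, $Q_0 = F^{-1}(1 + \log\alpha/\lambda)$ and $Q_1 = (\lambda + \log\alpha)\,E[L \mid L < Q_0]$ (Appendix B), compared with the mean-corrected single-loss formula (124) with $D = 1$. The sketch below is only an illustration under these assumptions: the function names are hypothetical, and the closed form of the censored mean is the elementary integral of $x f(x)$ for the Pareto density (77).

```python
import math

def pareto_quantile(p, a):
    """Inverse of F(x) = 1 - x**(-a), x >= 1."""
    return (1.0 - p) ** (-1.0 / a)

def pareto_censored_mean(q, a):
    """E[L | L < q] for the Pareto distribution; finite for every a > 0."""
    log_q = math.log(q)
    b = 1.0 - a
    # integral_1^q x f(x) dx = a * (q**b - 1) / b, with limit a * log(q) as b -> 0
    integral = a * (log_q if b == 0.0 else math.expm1(b * log_q) / b)
    return integral / (1.0 - q ** (-a))

def perturbative_first_order(alpha, lam, a):
    """Q_0 and Q_1 for Poisson(lam) frequency and Pareto(a) severity."""
    q0 = pareto_quantile(1.0 + math.log(alpha) / lam, a)
    q1 = (lam + math.log(alpha)) * pareto_censored_mean(q0, a)
    return q0, q1

def mean_corrected_single_loss(alpha, lam, a):
    """Single-loss quantile plus lam * E[L]; only defined for a > 1."""
    if a <= 1.0:
        raise ValueError("the unconditional mean is infinite for a <= 1")
    return pareto_quantile(1.0 - (1.0 - alpha) / lam, a) + lam * a / (a - 1.0)

if __name__ == "__main__":
    for a in (1.5, 1.1, 1.01, 1.0, 0.99, 0.9):
        q0, q1 = perturbative_first_order(0.999, 100.0, a)
        print(a, q0 + q1)
```

The censored mean passes smoothly through $a = 1$, where it reduces to a logarithm, whereas the correction $\lambda a/(a-1)$ grows without bound as $a \to 1^+$.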



5.3.1 Pareto distribution with finite mean (a > 1): 

We now compare the accuracy of the different approximations
for sums of random variables that follow a Pareto
distribution with finite mean using Monte Carlo simulations.
Figure 6 displays the relative error as a function of
the Pareto index $a$. In the limit $a \to 1^+$ the second order
asymptotic approximations $Q_{OW}$ and $Q^*_{OW}$ diverge. The
origin of this divergence is the increase of the correction term
which involves the unconditional mean of the
Figure 7: Relative error for Pareto/Poisson ($\lambda = 100$) as a
function of $\alpha$ for different values of $a$: $a = 2.00$ (upper plot) and
$a = 1.20$ (lower plot).

Figure 8: Relative error for Pareto/Poisson as a function of
$E[N]$ for $\alpha = 99.9\%$ and different values of $a$: $a = 2.00$ (upper
plot) and $a = 1.20$ (lower plot).



distribution. This mean grows without bound as $a$
approaches 1 from above. By contrast, the perturbative
expansion, which is expressed in terms of censored moments,
behaves well and actually becomes more accurate in this
limit. Figure 7 presents the dependence of the relative error
as a function of $\alpha$. The dependence on the average
frequency $\lambda = E[N]$ is shown in Figure 8. In all cases the
conclusions reached through the analysis of these results are
similar to the lognormal case.



0.90 the errors of this approximation are below the
uncertainty of the Monte Carlo estimates. The improvements
with respect to the standard approximations, $Q_{OW}$
or $Q^*_{OW}$, are especially significant for values of $a$ close to 1.



5.3.2 Pareto distribution with infinite mean (0 < a < 1):

We now evaluate the accuracy of the different approximations
for the percentiles of sums of random variables that
follow a Pareto distribution with infinite mean. Figure 9
displays the relative error of the different approximations
as a function of $a$, the tail parameter of the Pareto
distribution. Figure 10 plots the relative error as a function of $\alpha$
for two different values of $a$. Finally, the change in relative
error as the average frequency $E[N]$ varies is presented in
Figure 11. In this regime all approximations are fairly
accurate. Between the second order asymptotic approximations,
$Q_{OW}$ is more accurate than $Q^*_{OW}$.

For high percentiles, the best results correspond to $Q^{(3)}$,
the third order perturbative approximation. Beyond $\alpha =$
Figure 9: Relative error for Pareto/Poisson ($\lambda = 100$) as a
function of $a$ for $\alpha = 99\%$ (upper plot) and $\alpha = 99.9\%$ (lower
plot). The values of $a$ are ordered so that the heaviness of the
tails increases from left to right in the plots.

Figure 11: Relative error for Poisson/Pareto as a function of
$E[N]$ for $\alpha = 99.9\%$ and different values of $a$.

100 and different values of a.

5.3.3 Effective expansion parameter



Equation ([2]) has been derived using a purely formal
expansion parameter $\epsilon$, which is eventually set to 1. In this
section we take advantage of the simple form of the Pareto
distribution to identify the actual perturbative parameter
of the expansion for this type of random variables. To this
end, we analyze the leading contributions in the individual
terms in the expansion for $\alpha \to 1^-$. In terms of the
parameter $\delta = (1 - \alpha)$, the leading contributions for $\delta \to 0^+$ and
for all non-integer a ^ i are 



Qi 



N- 1 



Q-2 



N - 1 



N- 1 



a- 1 \N 



l-l/a 



O/o 



a- 1 \N, 
o(2o-l) ( 5\ l ~ 1,a 



o-2 V^ 

a(a+ 1) 



l/a 



(a- l) 2 (a-2) V^V, 
(),, 2a(a-l)(2a-l) ^\ 1_1/a 

2/a 



+ 



a — 3 

2a(a + l) 2 (a + 2) / S_ 
(a-l) 3 (a-2)(a-3) ViV 



(78) 
The pattern that emerges is the following: up to order $Q_k$,
with $k < a$, the terms $(\delta/N)^{(k-1)/a}$ dominate. Therefore,
for $k < a$, $(\delta/N)^{1/a}$ can be interpreted as an expansion
parameter. For $k > a$ the terms proportional to $(\delta/N)^{1-1/a}$
dominate. Since these terms are independent of $k$, there is
no longer a recognizable expansion parameter. However,
the prefactors, which depend on $a$, become smaller as the
order of the perturbative term increases. For $k = a$ both
types of terms contribute. It is interesting to note that
the dominance shifts precisely at the order in which the
moments cease to exist.
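The change of regime can be seen already at first order. The sketch below assumes, for deterministic $N$, the forms $Q_0 = F^{-1}(\alpha^{1/N})$ and $Q_1 = (N-1)\,E[L \mid L < Q_0]$ (the percentile of the maximum and the conditional mean of the remaining $N-1$ censored terms), and compares $Q_1$ with the leading behavior expected from the pattern above: a constant, $(\delta/N)^{0}$, when $a > 1$, and a term proportional to $(\delta/N)^{1-1/a}$ when $a < 1$. All function names are hypothetical.

```python
import math

def pareto_censored_mean(q, a):
    """E[L | L < q] for F(x) = 1 - x**(-a), x >= 1."""
    log_q = math.log(q)
    b = 1.0 - a
    integral = a * (log_q if b == 0.0 else math.expm1(b * log_q) / b)
    return integral / (1.0 - q ** (-a))

def first_order_terms(alpha, n, a):
    """Q_0 and Q_1 for a sum of n iid Pareto(a) variables (assumed forms)."""
    tail = -math.expm1(math.log(alpha) / n)  # 1 - alpha**(1/n), computed stably
    q0 = tail ** (-1.0 / a)
    q1 = (n - 1) * pareto_censored_mean(q0, a)
    return q0, q1

def leading_q1(alpha, n, a):
    """Leading contribution to Q_1 for delta = 1 - alpha -> 0, non-integer a."""
    if a == 1.0:
        raise ValueError("the leading-term formulas hold for non-integer a")
    delta = 1.0 - alpha
    if a > 1.0:
        return (n - 1) * a / (a - 1.0)
    return (n - 1) * a / (1.0 - a) * (delta / n) ** (1.0 - 1.0 / a)

if __name__ == "__main__":
    for a in (2.5, 1.5, 0.75, 0.5):
        _, q1 = first_order_terms(0.999, 100, a)
        print(a, q1, leading_q1(0.999, 100, a))
```

For $a > 1$ the first order term saturates at $(N-1)$ times the unconditional mean, while for $a < 1$ it keeps growing as $\delta \to 0^+$, which is the shift of dominance described in the text.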



investigation. 
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6 Conclusions 

Starting from a perturbative expansion for the percentile
of a sum of two random variables we derive a formal
expansion for the percentile of the sum of $N$ independent random
variables. Assuming that, for sufficiently high percentiles,
the maximum dominates the sum, the expansion is carried
out around the percentile of the maximum in the sum. This
zeroth order term in the perturbative series is similar to the
single-loss approximation [14], which can be derived from a
first order asymptotic analysis of the tails of sums of
subexponential random variables [38]. The first order perturbative
correction is similar to the mean-corrected single-loss
formula for distributions with finite mean [31], which can
also be derived using higher order asymptotics. Higher order
terms in the perturbative series are expressed in terms
of right-truncated moments. These censored moments are
always finite, regardless of whether the original uncensored 
distributions have finite or divergent moments. The pertur- 
bative series becomes more accurate for higher percentiles 
and heavier tails. From the empirical study carried out 
using either exact results or Monte Carlo simulation, one 
concludes that the perturbative approach is more accurate 
than previous approximate formulas proposed in the liter- 
ature [351 H3 EH]. Furthermore, the accuracy of the ap- 
proximation can be improved by including more terms in 
the perturbative series, up to a certain order. Beyond this 
order the approximation error generally increases. Another
practical difficulty is the computational cost of the
higher order terms. Nonetheless, the third order
approximation is sufficiently accurate for the percentiles
(99% to 99.9%) and the types of distributions that are used in
practice in many fields of application, such as finance and
insurance. As an extension of this research, the perturba- 
tive analysis is being applied to sums of random variables 
that are not identically distributed and may have depen- 
dencies. A more detailed analysis of the convergence of the 
perturbative series and the development of accurate approx- 
imations for lower percentiles are also the subject of current 
A Complete Bell polynomials

The complete Bell polynomials (CBP) (named after Bell,
|H]) arise in many contexts, such as the n-times differentiation
of a function (Faa di Bruno formula) or to express the
relationship between moments and cumulants in statistics.

Let $z(t)$ be an arbitrary function of $t$ whose $k$-th derivative
is $z^{(k)}(t) = \frac{d^k}{dt^k} z(t)$. The complete Bell polynomial of order $k$ is

$$B_k\left(z^{(1)}(t), \ldots, z^{(k)}(t)\right) = e^{-z(t)}\,\frac{d^k}{dt^k}\,e^{z(t)}. \qquad (79)$$

From this definition, the CBP can be shown to satisfy

$$\exp\left(\sum_{p=1}^{\infty} x_p \frac{t^p}{p!}\right) = 1 + \sum_{q=1}^{\infty} B_q(x_1, \ldots, x_q)\,\frac{t^q}{q!}. \qquad (80)$$

This expression provides a relationship between the power
series expansion of the moment generating function and the
cumulant generating function. In particular

$$\mu_q = B_q(\kappa_1, \ldots, \kappa_q), \qquad (81)$$

where $\mu_q$ are the moments of a random variable and $\kappa_p$ its
cumulants.

In this paper we use a centered version of the CBP, which
is defined by $C_k(x_2, \ldots, x_k) = B_k(0, x_2, \ldots, x_k)$. In terms
of $C_k(x_2, \ldots, x_k)$ the complete Bell polynomial of order $k$
is

$$B_k(x_1, \ldots, x_k) = \sum_{s=0}^{k} \binom{k}{s}\,x_1^s\,C_{k-s}(x_2, \ldots, x_{k-s}). \qquad (82)$$

The CBP satisfy the following recursive formulae

$$\begin{aligned}
k = 0:&\quad B_0 = 1, \quad C_0 = 1 \\
k = 1:&\quad B_1(x_1) = x_1, \quad C_1 = 0 \\
k \geq 2:&\quad B_k(x_1, \ldots, x_k) = \sum_{s=1}^{k} \binom{k-1}{s-1}\,x_s\,B_{k-s}(x_1, \ldots, x_{k-s}) \\
&\quad C_k(x_2, \ldots, x_k) = \sum_{s=2}^{k} \binom{k-1}{s-1}\,x_s\,C_{k-s}(x_2, \ldots, x_{k-s})
\end{aligned} \qquad (83)$$

There exists an alternative representation for the CBP,
which is related to the structure of the partitions of a set
of size $k$

$$B_k(x_1, \ldots, x_k) = \sum_{\pi \in \mathcal{P}(k)} \prod_{b \in \pi} x_{|b|}, \qquad (84)$$

where $\mathcal{P}(k)$ is the set of all partitions of the set $\{1, \ldots, k\}$
(if $k = 0$, $\mathcal{P}(k)$ contains one empty set) and $|b|$ denotes the
number of elements in set $b$.

B Explicit formulas for particular frequency distributions

In this section we provide explicit formulas for the first
terms in the perturbative series when the number of terms
in the sum is distributed as a Poisson or as a negative
binomial.

B.0.4 Poisson distribution

We consider the particular case where $N$, the number of
terms in the sum, follows a Poisson distribution with
parameter $\lambda = E[N]$

$$p_n = \frac{\lambda^n}{n!}\,e^{-\lambda}. \qquad (85)$$

The moment generating function is

$$M_N(s) = \exp\left(\lambda(e^s - 1)\right). \qquad (86)$$

From this we derive

$$\begin{aligned}
\lambda_0(x) &= \exp\left(\lambda(F(x) - 1)\right)\lambda f(x) \\
\lambda_1(x) &= \lambda F(x)\,\lambda_0(x) \\
\lambda_2(x) &= (1 + \lambda F(x))\,\lambda_1(x).
\end{aligned} \qquad (87)$$

The first three terms of the perturbative expansion are

$$\begin{aligned}
Q_0 &= F^{-1}\left(\frac{\log\alpha}{\lambda} + 1\right) \\
Q_1 &= (\lambda + \log\alpha)\,E[L \mid L < Q_0] \\
Q_2 &= -\frac{1}{2\lambda_0(x)}\,\partial_x\left(\lambda_1(x)\left(\kappa_2(x) + \kappa_1(x)^2\right)\right)\Big|_{x=Q_0} \\
    &= -\frac{1}{2}\left(\lambda f(Q_0) + \frac{f'(Q_0)}{f(Q_0)}\right)(\log\alpha + \lambda)\,E[L^2 \mid L < Q_0] - \frac{1}{2}\,\lambda f(Q_0)\,Q_0^2,
\end{aligned}$$

where, in the last step, we have used the identity

$$\partial_x \mu_p(x) = \frac{f(x)}{F(x)}\left(x^p - \mu_p(x)\right) \qquad (88)$$

for the censored moments $\mu_p(x) = E[L^p \mid L < x]$.

B.0.5 The negative binomial distribution

The probability mass function of the negative binomial
distribution with parameters $(p, r)$ is

$$p_n = \binom{n + r - 1}{n}\,p^r (1 - p)^n. \qquad (89)$$

Setting $q = 1 - p$, the moment generating function is

$$M_N(s) = p^r \left[1 - q e^s\right]^{-r}. \qquad (90)$$

In terms of $\xi(x) = 1 - qF(x)$ we have

$$\begin{aligned}
\lambda_0(x) &= p^r\,\xi(x)^{-r-1}\,q r f(x) \\
\lambda_1(x) &= \xi(x)^{-1}\,q(1 + r)F(x)\,\lambda_0(x) \\
\lambda_2(x) &= \xi(x)^{-1}\left[1 + q(1 + r)F(x)\right]\lambda_1(x).
\end{aligned} \qquad (91)$$

The first three terms in the perturbative expansion are



By defining the operators |rj (,l) = ^^| e =o, n > o\, the 
terms of the perturbative expansion can be obtained by 
solving the equations 



n {n) (j){s,x) 



= , for all n > 0. 



(97) 



s=0,x=Q 

The sequence of operators fjW has the recurrence relation 

n<°> = o 

r>« = a, 



Qo = F- 

Qi = (l + r) 

r + l 



fc-2 



Q« 



o 



1/r 






1 E[L|L<Q ] 



/) 3 



for fc > 2 and with 9 S = Q\~ d s . Expressing each operator 
f2( n ) in the form 



E [L 2 \L < Qo] h{\ - h) (q(r + 2)/(Q ) + h 
+ E[L\L < Q } 2 (1 - h) 2 ( q(r + 3)f(Q 



/'(Qo 

/(Qo 

. /'(Qo) 

/(Qo) 



i=0 j=0 



(99) 



+ E [L\L < Qo] 2Q qh(l - h)f(Q ) + Q 2 qh 2 f(Q 
where q = 1 — p and h = pa~ 1 ' r . 



(1981) can be expressed as recursion relations for the coeffi- 
cients 

w o,o — Vfe 

, ,W - n 
w i,o - u 



(92) 



fc-2 



..(fc) _ 
W 0,j - 



C Recursive formulas for the perturbative 
series 

The objective of this section is to derive recursive
expressions for the terms in the perturbative expansion of high
quantiles of $Z = X + \epsilon Y$. These expressions are better
suited for the numerical computation of the series than the
expressions derived in section [5J 
The starting point is ((SJ. By defining the function 



,(fc) _ 






I — max(l ,j— 1) 
fc-2 

E 

I — max(l,i,j — 1) 



for i)j > 1. Finally, the terms in the perturbative series 
can be derived from (H?T1) as 



1 n n 

ototEE-^^ 



,£(0,0) ^/1^~*J 
Y i=0 3 = 1 



(101) 



$$\phi(s, x) = f_X(x)\,M_{Y|X}(s|x), \qquad (93)$$

where

$$M_{Y|X}(s|x) = \int dy\,e^{sy} f_{Y|X}(y|x) \qquad (94)$$

is the moment generating function of $Y$ conditional on $X$,
and the operator 

n e = U s Q-* d °) d * - 1) d-\ (95) 

(|SJl can be written as 

^MUo,*=Q = «• (96) 



where 

^ = dldi<P(s,x) . (102) 

s=0,x=Q a 

The remainder of this appendix is devoted to the derivation
of explicit recursive formulas for the quantities $\phi^{(i,j)}$
of the perturbative expansion for sums of $N$ independent
random variables, $Z_N = \sum_{n=1}^{N} L_n$. These independent rv's
are identically distributed according to $F(l)$ (density $f(l)$).



C.l Deterministic N 

Consider the case of sums of $N$ iidrv's, with $N$ fixed. In
this case, the expansion around the maximum of the terms
in the sum is characterized by

$$\begin{aligned}
f_X(x) &= N F(x)^{N-1} f(x) \\
M_{Y|X}(s|x) &= M_L(s|x)^{N-1}.
\end{aligned} \qquad (103)$$

Therefore

$$\phi(s, x) = N f(x) F(x)^{N-1} M_L(s|x)^{N-1}. \qquad (104)$$

To make the notation more compact, the following defini- 
tions are used in the derivation 

iN-X, 



for j > 1. Besides m^ , the censored moments with thresh- 
old x — Qo, the remaining terms in the calculation, namely, 
the derivatives of logarithm of the severity F^ and of the 
density of the maximum d^fx(x) = d* (Nf(x)F(x) N ~ 1 ) 
can be readily computed from the derivatives of the sever- 
ity CDF, also via recursion. For instance 



j-i 



diF(x)-J2[ 3 , 1 )d k x F{x)F^< 



?|5 - # 

mf = dim{x)\ x=Qo = dldiM L (s\x)\ s=0x=Qo 
PW = dilogF(x)\ 0n 



M^ x =dldlMZ-\s\x 



pU) 



C.2 Random N 



k=\ 



F(Qo) 



-.(111) 



/^s^log/Ca;)! 



z=Qo ' 



where 



A^(s|x)= C dle sl ^- 
Jo F{x) 



(105) 



(106) 



In this case 

M Y \x(s\x) =E[e SY \X = x] 



is the generating function of the censored moments of the 
individual terms in the sum with censoring threshold x. 

Using these expressions and definitions, the coefficients in 
(fTUTj) are 



^2M Y \x,N(s\x,n) 

n=0 
oo 

J2M l (s\x) 



{i,0) 



= J2J2(-v 



1=0 k=0 



.-it J 
kj \l 



' A Q[M^- k) d k x fx(x)\ 



fx n (x)p r , 

fx(x) 

n-l fx n {x) Pn 

fx(x) 
with p n = P[N = n]. In terms of these quantities 

4>{s,x) =E[f XN (x)M L (s\x) N - 1 ] . 
The coefficients in (1101)) are then given by 



n=0 



(112) 



(113) 



(107) 

The derivatives of the conditional moment generating function
$M_{Y|X}(s|x)$ evaluated at $s = 0$ and $x = Q_0$ can be
computed using the recursion



/=n 



^^ECe'^^'^'^f^w^ 1 ^^ 



s=0 



1=0 



=e(; jQic-i)*-^. 



i-l t (i-l,j) 

> 



A/f (0,i) _ r 



L r|x 



X 



FIX 



»-l 3 



(*-i)EE 



/ 



lU i)^i5*H.T fc) (los) 



i=0 fc=0 

for $j > 0$, $i > 1$ and with $\delta_{ij}$ the Kronecker delta. To
evaluate this expression one needs the derivatives of the censored
cumulants evaluated at $Q_0$. These can be computed using
the recursion

k^ = 



where 

Z™ = dldiE[(N- l) a f XN (x)M^-Hs,x) 
These quantities have the recursion 

9lK(x)\ x= 



x=Q 
s=0 



(114) 



(115) 



&>»=diE[(N-irf XN (x) 

i—l j 

eF=EE 



x=Q 



k9 ] = m!:P 



i—l j 

1=1 k=0 



i - 1 \ a 
i 



(<0 i U-k) 

,m, k) , 
k 



(109) 



for j > 0, i > 1. Finally, the derivatives of the censored 
moments evaluated at Qo are given by the recursion 



roP = 



k=0 v 7 



,0-i-fe) 



(110) 



• l '' J ' l =^^7 1 )(i)&i ) *HT fc, *>i.i>o die) 

where the coefficients $\lambda_a(x)$ have been defined in (47).
Their values at $x = Q_0$ can be computed using equation
(49) in terms of the derivatives of the moment generating
function. To obtain the derivatives
$\lambda_a^{(k)} = \partial_x^k \lambda_a(x)\big|_{x=Q_0}$ in
the previous equation, the following recursion can be used
A ( fe ) = g ^ " 1^ (;(W) A (*-<-i) + ^h-D^-D). 

(117) 



IS 



The remaining elements in the calculation (moments, censored
cumulants and their derivatives, etc.) are computed
as in the case with deterministic $N$.

D Derivation of higher order asymptotic approximations

In this section we present the derivations of the single-loss
approximation and higher order corrections that have been
given in the literature [22, 23, 25, 26, 27, 28, 29, 30].

D.1 Second order approximation by Omey and Willekens [22, 25]

It is possible to derive corrections to the single-loss
approximation by using the second order behavior of the tail
probability of subordinate distributions [25, 23]. These references
are also the basis for the analysis presented in [351 US].

For the case in which the mean is finite, the second order
approximation for the tail distribution of the sum is [26]

$$1 - G(x) \sim E[N]\,(1 - F(x)) + E[N(N-1)]\,\mu_L f(x), \qquad x \to \infty. \qquad (118)$$

From this, it is possible to derive a nonlinear equation for a
second order approximation of $Q = G^{-1}(\alpha)$, the percentile
of the sum at the probability level $\alpha$

$$Q \approx F^{-1}\left(1 - \frac{1-\alpha}{E[N]} + \left(\frac{E[N^2]}{E[N]} - 1\right)\mu_L f(Q)\right). \qquad (119)$$

This nonlinear equation can be solved numerically.

A closed-form expression that is similar to the correction by
the mean proposed in [31] is obtained using an approximate
solution of

$$1 - \alpha \approx E[N]\,(1 - F(Q)) + \epsilon\,E[N(N-1)]\,\mu_L f(Q), \qquad (120)$$

where the parameter $\epsilon = 1$ has been introduced to order
the terms in a perturbative expansion of the solution.
Expanding (120) up to first order in $\epsilon$ we obtain

$$\begin{aligned}
1 - \alpha &= E[N]\,(1 - F(Q'_0 + \epsilon Q'_1 + \ldots)) + \epsilon\,E[N(N-1)]\,\mu_L f(Q'_0 + \epsilon Q'_1 + \ldots) \\
1 - \alpha &= E[N]\,(1 - F(Q'_0)) - \epsilon\left(E[N]\,f(Q'_0)\,Q'_1 - E[N(N-1)]\,\mu_L f(Q'_0)\right) + O(\epsilon^2).
\end{aligned} \qquad (121)$$

Identifying terms of the same order,

$$1 - \alpha = E[N]\,(1 - F(Q'_0)) \quad \Rightarrow \quad Q'_0 = F^{-1}\left(1 - \frac{1-\alpha}{E[N]}\right) \qquad (122)$$

$$0 = \left(E[N]\,Q'_1 - E[N(N-1)]\,\mu_L\right) f(Q'_0) \quad \Rightarrow \quad Q'_1 = \left(\frac{E[N^2]}{E[N]} - 1\right)\mu_L, \qquad (123)$$

which provides a good approximation to the solution provided
that $f(Q'_0) > 0$ and $Q'_1 \ll Q'_0$. Therefore, the
approximate solution of (120) with $\epsilon = 1$ is

$$Q \approx F^{-1}\left(1 - \frac{1-\alpha}{E[N]}\right) + \left(E[N] + (D - 1)\right)\mu_L, \qquad (124)$$

where $D = \mathrm{Var}[N]/E[N]$ is the index of dispersion ($D = 1$
for the Poisson distribution and $D > 1$ for the negative
binomial distribution). The first term in (124) is the single-loss
formula. The second term is a correction that involves
the mean.

Similar approximate formulas can be given for the case
of distributions $F(L)$ with infinite mean and whose
corresponding density is regularly varying, $f(L) \in RV_{-(1+a)}$,
using the results of [32]

$$1 - G(x) \sim E[N]\,(1 - F(x)) + c_a\,E[N(N-1)]\,\mu_F(x) f(x) \qquad \text{for } x \to \infty, \qquad (125)$$

where

$$\mu_F(x) = \int_0^x ds\,(1 - F(s)) = (1 - F(x))\,x + F(x)\,E[L \mid L < x], \qquad (126)$$

and

$$c_a = \begin{cases} 1 & a = 1 \\ \left(1 - \dfrac{1}{a}\right)\dfrac{[\Gamma(1-a)]^2}{2\,\Gamma(1-2a)} & a < 1. \end{cases} \qquad (127)$$

In this case a second order approximation of $Q = G^{-1}(\alpha)$
can be obtained from

$$Q \approx F^{-1}\left(1 - \frac{1-\alpha}{E[N]} + c_a\left(\frac{E[N^2]}{E[N]} - 1\right)\mu_F(Q) f(Q)\right). \qquad (128)$$
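As an illustration of how the implicit second order equations can be handled, the sketch below solves the finite-mean case by applying bisection to the tail equation (118) set equal to $1 - \alpha$, which is equivalent to (119), and compares the result with the closed form (124). The choice of bisection, the Pareto severity with $a > 1$, and all function names are assumptions made here for the example; the text only states that the equation is solved numerically.

```python
def pareto_tail(x, a):
    """1 - F(x) for F(x) = 1 - x**(-a), x >= 1."""
    return x ** (-a)

def pareto_pdf(x, a):
    """Density f(x) = a * x**(-a - 1), x >= 1."""
    return a * x ** (-a - 1.0)

def second_order_tail(x, mean_n, mean_n_nm1, a):
    """Right-hand side of (118) for Pareto severities with finite mean (a > 1)."""
    mu = a / (a - 1.0)
    return mean_n * pareto_tail(x, a) + mean_n_nm1 * mu * pareto_pdf(x, a)

def implicit_quantile(alpha, mean_n, mean_n_nm1, a, rel_tol=1e-12):
    """Solve second_order_tail(Q) = 1 - alpha by bisection (the tail decreases in Q)."""
    target = 1.0 - alpha
    lo, hi = 1.0, 2.0
    while second_order_tail(hi, mean_n, mean_n_nm1, a) > target:
        lo, hi = hi, 2.0 * hi  # expand the bracket until the root is enclosed
    while hi - lo > rel_tol * hi:
        mid = 0.5 * (lo + hi)
        if second_order_tail(mid, mean_n, mean_n_nm1, a) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def closed_form_quantile(alpha, mean_n, mean_n_nm1, a):
    """Single-loss quantile plus the mean correction, as in (124)."""
    q_single_loss = ((1.0 - alpha) / mean_n) ** (-1.0 / a)
    return q_single_loss + mean_n_nm1 / mean_n * a / (a - 1.0)

if __name__ == "__main__":
    lam = 100.0  # Poisson frequency: E[N] = lam, E[N(N-1)] = lam**2
    print(implicit_quantile(0.999, lam, lam**2, 2.0))
    print(closed_form_quantile(0.999, lam, lam**2, 2.0))
```

The two frequency moments are passed explicitly, so other frequency distributions only require different values of $E[N]$ and $E[N(N-1)]$. The infinite-mean equation (128) could be treated in the same way by replacing the second term with $c_a\,E[N(N-1)]\,\mu_F(x) f(x)$.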
Again, this nonlinear equation can be solved numerically us- 
ing, for example, an iterative scheme. Alternatively, an ap- 
proximate closed-form expression can be obtained by means 
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of a perturbative scheme analogous to the finite mean case diffusion kernel 



Q' + c a (E[N] + (D-l))MQ'o), (129) l-G(x)^X 



Q'o = F- 1 1- 



1-Q 

E[iV[ 



MQ'o) = / (i-F( s ))ds 

Jo 



(130) 



rf/2 



2ttA/4 



exp ■ 



(z - x + A/^lT 

2vF~ 



F(z) 



(136) 



' ■- a -Q> 0+ (i-L-£)E[L\L<Q' ], 



E[N] 



E[N] 



(131) 



The corresponding second order approximation (m = 2) 
for Q, the a percentile of G is the solution of the nonlinear 
equation 



1 — a = A 



D.2 Asymptotic expansion by Barbe and McCormick [27, 28, 29]

This section uses the approximations for the distribution
of sums of independent random variables with heavy tails
derived in [27, 28, 29]. For simplicity, we assume that the
number of terms in the sum is sampled from a Poisson
distribution. Assuming that the first $m$ moments of the
variables in the sum are finite

$$1 - G(x) = \lambda \exp\left(\lambda \sum_{i=1}^{m} \frac{(-1)^i \mu_i}{i!}\,\frac{d^i}{dx^i}\right)[1 - F(x)] + O\left(h^m(x)\,[1 - F(x)]\right), \qquad (132)$$

where $h(x) = f(x)/(1 - F(x))$ and $\mu_i = E[L^i] = \int_0^\infty dx\,f(x)\,x^i$
is the $i$th moment of $L$. For $m = 0$, the
single-loss approximation is recovered. The first order
approximation ($m = 1$) is



1 



1 



exp ■ 



2n\n [ £ ] 



(z — Q + X^lY 



2X^ 



2] 



F{z) 



(137) 



D.3 Asymptotics with a shifted argument 

The results of this section are based on the expansion for 
G derived in 30 using only evaluations of F at different 
arguments 

1 - G{x) w E [N] (^F(x -ki) + ... + i m F(x - km)) 

(138) 
for some constants ^i, . . . , £ m , hi, . . . , k m . Assuming that 
the first m moments of F are finite, these constants are the 
solution of the system of equations 



$$1 - G(x) \approx \lambda\,e^{-\lambda \mu_1 \frac{d}{dx}}\,[1 - F(x)]. \qquad (133)$$

In [2H1 [SH] the authors proceed by performing a Taylor
expansion of the right-hand side of (133). Here, we derive an
exact formula by realizing that the Taylor expansion can
be resummed. This resummation results in a translation of
the argument of $F$







L 


in 

£(c<-E[JV]fcj)& = l;*=l,...,m, (139) 


= E 


NiXi + .-. + XN) 1 


; for > 0. There is 



some freedom in the choice fci , . . . , k m . In [30] the authors 
propose to determine the values of these parameters by en- 
forcing the constraints 



$$1 - G(x) \approx \lambda\,[1 - F(x - \lambda \mu_1)]. \qquad (134)$$



E [N] fc™ +l ) tj =0, for i = 1, . . . , m - 1. 



Therefore, the first order approximation to the a percentile 
of G yields the correction by the mean 



.7=1 



$$Q \approx F^{-1}\left(1 - \frac{1-\alpha}{\lambda}\right) + \lambda \mu_L.$$



(140) 

Therefore, the approximation of order m is obtained by 
solving the set of nonlinear equations 



(135) 



also in this derivation. The second order approximation 
for G can also be expressed in terms of an integral over a 



> £jkj = Ci, for i = 0, . . . , 2m — 1. 

i=i 

where Cj = Cj/E [A^] for i > 0. 



(141) 
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For m = 1 
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